Abstract — Image retrieval is the process of automatically
retrieving the images that most closely match a query image by
extracting basic features such as edges, shape, color and
texture from it. The proposed image retrieval system extracts
texture features using the gray-level co-occurrence matrix
(GLCM) and the color co-occurrence matrix (CCM). The GLCM and
CCM texture features are each combined with a color feature
obtained by quantizing the HSV color space. The resulting
multi-feature vectors are compared using a Euclidean distance
classifier. The performance of the proposed system is measured
through several experiments.

Index Terms — Feature extraction, Texture, Image retrieval, 
Euclidean distance

I. Introduction 

Texture is an important property of images. It is a powerful
regional descriptor that helps in the retrieval process.
Texture on its own cannot find similar images, but it can be
used to separate textured images from non-textured ones and
can then be combined with another visual attribute such as
color to make retrieval more effective. Texture is one of the
most important characteristics used to classify and recognize
objects, and it has been used to find similarities between
images in multimedia databases [1]. Various texture
representations have been investigated in pattern recognition
and computer vision. Texture representation methods can be
classified into two categories: structural and statistical.
Structural methods, including morphological operators and
adjacency graphs, describe texture by identifying structural
primitives and their placement rules. They tend to be most
effective when applied to very regular textures. Statistical
methods, including Fourier power spectra, co-occurrence
matrices, shift-invariant principal component analysis (SPCA),
Tamura features, Wold decomposition, Markov random fields,
fractal models, and multi-resolution filtering techniques such
as Gabor and wavelet transforms, characterize texture by the
statistical distribution of the image intensity [2]. Many
researchers have worked on content-based image retrieval (CBIR)
surveys [3]-[5], texture feature extraction [9], [16], and
multi-feature algorithms [11]-[12] for image retrieval.

A. Gray level co-occurrence matrix:

The gray level co-occurrence matrix (GLCM) is a well-known and
widely used method for extracting texture features [18]. The
co-occurrence matrix is defined by the joint probability
density of two pixels at different positions. It reflects not
only the distribution of brightness but also the spatial
distribution of pixels that have the same or similar
brightness. The co-occurrence matrix is a second-order
statistic of the changes in image brightness, and it is the
basis for analyzing the local patterns and arrangement rules
of an image. For a digital image f

of size $M \times N$, denoted $f(x, y)$, the gray level
co-occurrence matrix is written $P(i, j \mid d, \theta)$. For
the four directions it is defined as

$$P(i, j \mid d, 0^\circ) = \#\{((x_1, y_1), (x_2, y_2)) \in M \times N \mid f(x_1, y_1) = i,\ f(x_2, y_2) = j,\ |x_1 - x_2| = 0,\ |y_1 - y_2| = d\} \tag{1}$$

$$P(i, j \mid d, 45^\circ) = \#\{((x_1, y_1), (x_2, y_2)) \in M \times N \mid f(x_1, y_1) = i,\ f(x_2, y_2) = j,\ (x_1 - x_2 = d,\ y_1 - y_2 = -d)\ \text{or}\ (x_1 - x_2 = -d,\ y_1 - y_2 = d)\} \tag{2}$$

$$P(i, j \mid d, 90^\circ) = \#\{((x_1, y_1), (x_2, y_2)) \in M \times N \mid f(x_1, y_1) = i,\ f(x_2, y_2) = j,\ |x_1 - x_2| = d,\ |y_1 - y_2| = 0\} \tag{3}$$

$$P(i, j \mid d, 135^\circ) = \#\{((x_1, y_1), (x_2, y_2)) \in M \times N \mid f(x_1, y_1) = i,\ f(x_2, y_2) = j,\ (x_1 - x_2 = d,\ y_1 - y_2 = d)\ \text{or}\ (x_1 - x_2 = -d,\ y_1 - y_2 = -d)\} \tag{4}$$

where $\#\{\cdot\}$ is the number of occurrences of pixel pairs
with gray levels $i$ and $j$ that are a distance $d$ apart, and
$\theta$ is the angle between the pixel pair and the axis
($\theta = 0^\circ, 45^\circ, 90^\circ, 135^\circ$, four
directions). The gray level co-occurrence matrix is therefore
written $P(i, j \mid d, \theta)$ according to the distance $d$
and the angle $\theta$.

As an example, Figure 1 shows how the first three values of a
gray level co-occurrence matrix are calculated. In the output
matrix, element (1,1) contains the value 1 because there is
only one instance in the input image where two horizontally
adjacent pixels have the values 1 and 1, respectively. Element
(1,2) contains the value 2 because there are two instances
where two horizontally adjacent pixels have the values 1 and 2.
Element (1,3) has the value 0 because there are no instances of
two horizontally adjacent pixels with the values 1 and 3. The
computation continues over the input image, scanning it for
other pixel pairs (i, j) and recording the counts in the
corresponding elements of the gray level co-occurrence matrix.



Figure 1: Example of Gray Level Co-occurrence Matrix 
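
The counting rule in Equations (1)-(4) and the example above can
be sketched in Python as follows. This is a minimal
illustration, not the authors' implementation: the function
name `glcm_counts`, the `symmetric` flag and the 4x5 sample
image are assumptions introduced here (the figure itself is not
reproduced in the text), and gray levels are shifted to start at
0 for array indexing. With `symmetric=False` only left-to-right
pairs are counted, which reproduces the counts described in the
example; with `symmetric=True` both orderings are counted, as
the absolute values in Equation (1) imply.

```python
import numpy as np

def glcm_counts(img, dx, dy, levels, symmetric=True):
    """Count gray-level pairs (i, j) whose pixels are offset by (dx, dy).

    x is the row index and y the column index, so (dx, dy) = (0, d)
    corresponds to 0 degrees and (d, 0) to 90 degrees.
    """
    img = np.asarray(img)
    rows, cols = img.shape
    P = np.zeros((levels, levels), dtype=np.int64)
    for x1 in range(rows):
        for y1 in range(cols):
            x2, y2 = x1 + dx, y1 + dy
            if 0 <= x2 < rows and 0 <= y2 < cols:
                P[img[x1, y1], img[x2, y2]] += 1
    if symmetric:
        # Count each pair in both orders.
        P = P + P.T
    return P

# Hypothetical 4x5 image with gray levels 1..8 (an assumption, not the paper's figure).
sample = np.array([[1, 1, 5, 6, 8],
                   [2, 3, 5, 7, 1],
                   [4, 5, 7, 1, 2],
                   [8, 5, 1, 2, 5]])

# Shift levels to 0..7 and count horizontally adjacent pairs (d = 1, 0 degrees).
P = glcm_counts(sample - 1, dx=0, dy=1, levels=8, symmetric=False)
print(P[0, 0], P[0, 1], P[0, 2])   # 1 2 0
```

The 45° and 135° directions would use the offsets `(-d, d)` and
`(d, d)` under the same row/column convention.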


The gray level co-occurrence matrix is composed of probability
values: $P(i, j \mid d, \theta)$ expresses the probability of a
pixel pair at direction $\theta$ and interval $d$. Once
$\theta$ and $d$ are fixed, $P(i, j \mid d, \theta)$ is
abbreviated as $P_{ij}$.

The gray level co-occurrence matrix is clearly a symmetric
matrix, and its order is determined by the number of gray
levels in the image. The elements of the matrix are computed by
the following equation:

$$p(i, j \mid d, \theta) = \frac{P(i, j \mid d, \theta)}{\sum_{i=1}^{256} \sum_{j=1}^{256} P(i, j \mid d, \theta)} \tag{5}$$



The gray level co-occurrence matrix expresses the texture
feature through the correlation between the gray levels of
pixel pairs at different positions, and it describes the
texture quantitatively. In the proposed method, four features
are selected: energy, contrast, entropy and inverse difference.

Energy:

$$E = \sum_{x=1}^{256} \sum_{y=1}^{256} p(x, y)^2 \tag{6}$$

Energy measures the homogeneity of the gray-scale texture,
reflecting how uniformly the gray levels of the image are
distributed and how coarse the texture is.

Contrast:

$$I = \sum_{x=1}^{256} \sum_{y=1}^{256} (x - y)^2 \, p(x, y) \tag{7}$$

Contrast is the moment of inertia about the main diagonal. It
measures how the values of the matrix are distributed and how
much local variation the image contains, reflecting the clarity
of the image and the depth of its texture grooves. A larger
contrast means a deeper texture.

Entropy:

$$S = -\sum_{x=1}^{256} \sum_{y=1}^{256} p(x, y) \log p(x, y) \tag{8}$$

Entropy measures the randomness of the image texture. It
reaches its maximum value when all entries of the co-occurrence
matrix are equal, and it is smaller when the entries are very
uneven. A maximum entropy therefore implies that the gray-level
distribution of the image is random.

Inverse difference:

$$H = \sum_{x=1}^{256} \sum_{y=1}^{256} \frac{1}{1 + (x - y)^2} \, p(x, y) \tag{9}$$

Inverse difference measures the amount of local change in the
image texture. A large value indicates that the texture lacks
variation between regions and is locally very uniform.
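
The normalization in Equation (5) and the four statistics in
Equations (6)-(9) can be sketched as follows. This is a minimal
sketch under stated assumptions, not the authors'
implementation: the function name `glcm_features` is
hypothetical, the logarithm is taken as the natural logarithm
(the text does not state the base), and zero entries are skipped
in the entropy sum so that 0 log 0 is treated as 0.

```python
import numpy as np

def glcm_features(P):
    """Return (energy, contrast, entropy, inverse difference) of a count matrix P."""
    P = np.asarray(P, dtype=float)
    p = P / P.sum()                               # Equation (5): normalize counts
    x, y = np.indices(p.shape)                    # index grids for the (x - y) terms
    diff2 = (x - y) ** 2
    energy = np.sum(p ** 2)                       # Equation (6)
    contrast = np.sum(diff2 * p)                  # Equation (7)
    nz = p > 0                                    # skip zeros: 0 * log(0) taken as 0
    entropy = -np.sum(p[nz] * np.log(p[nz]))      # Equation (8), natural log assumed
    inv_diff = np.sum(p / (1.0 + diff2))          # Equation (9)
    return energy, contrast, entropy, inv_diff

uniform = np.ones((4, 4))    # every gray-level pair equally frequent
diagonal = np.eye(4)         # only identical gray levels co-occur
print(glcm_features(uniform))
print(glcm_features(diagonal))
```

For these two toy matrices the uniform one gives the larger
entropy (log 16 against log 4), while the diagonal one gives
zero contrast and an inverse difference of 1.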


II. Methodology 

The texture features are extracted with the gray level
co-occurrence matrix and the color co-occurrence matrix, and
the results of the two methods are passed to the Euclidean
distance function to find the closest matching images.

A. Image Database

The experimental data set contains 1000 images from the
Corel database [14]. The images are divided into 10 categories,
and each category contains 100 images of size 256x384 or
384x256.

B. Image to Feature Vector 



Figure 4: Derivation of the Feature Vector of GLCM 


The above figure represents the extraction of texture features
using GLCM. To extract the feature vector, the RGB images are
converted to gray scale images. The GLCM method creates a
symmetric matrix of probability values based on the distance
and direction between the pixels of the image. The order of the
matrix is determined by the number of gray levels in the image.
From the matrix obtained by GLCM, the statistical features
energy, contrast, entropy and inverse difference, (6)-(9), are
computed to form a 4-dimensional texture feature.



[1x88] 

Figure 5: Derivation of the Feature Vector of CCM 
The color components R and G in RGB color space and H and I in
HSV color space are each processed with the co-occurrence
matrix at a direction of 90°. The statistical features
extracted from each co-occurrence matrix are energy, contrast,
entropy and inverse difference, given by (6) to (9). In this
method, a 16-dimensional texture feature is obtained from the
four components R, G, H and I and their respective statistics
E, I, S and H.

C. Feature Extraction Algorithm based on GLCM

The following steps show the image retrieval process using the
gray level co-occurrence matrix.

Step 1: Separate the R, G, B planes of the image.

Step 2: Convert the R, G, B color channels to gray scale.

Step 3: Compute the GLCM matrices as given by Equation (2).

Step 4: Compute the probability values of the GLCM as given by
Equation (5).

Step 5: From the probability values of the GLCM, compute the
statistical features energy, entropy, contrast and inverse
difference as given by Equations (6)-(9).

Step 6: Normalize the energy, entropy, contrast and inverse
difference values.

Step 7: Construct the cumulative HSV color histogram of the
query image.

Step 8: Construct a combined feature vector for color and
texture.

Step 9: Calculate the Euclidean distance between the normalized
combined color and texture feature vectors of the query image
and each database image.

Step 10: Retrieve the 10 most similar images, those with
minimum distance.

A query image is converted to gray scale, after which a GLCM of
probability values is created for the chosen directions and
pixel distance. The statistical features energy, entropy,
contrast and inverse difference are computed for each GLCM. The
similarity between images is measured from two types of
features, the color features and the texture features, which
are combined and compared using the Euclidean distance. The
distance values are then sorted in ascending order, and the ten
best matches are displayed.
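
Steps 7 and 8 can be illustrated with the sketch below. It is
only a sketch under stated assumptions: the text does not give
the HSV quantization, so the 8 x 3 x 3 bin layout, the function
name `cumulative_hsv_histogram` and the placeholder texture
values are all hypothetical. The conversion uses the standard
library function `colorsys.rgb_to_hsv`, which expects and
returns values in [0, 1].

```python
import colorsys
import numpy as np

def cumulative_hsv_histogram(rgb, bins=(8, 3, 3)):
    """Cumulative histogram of quantized HSV values for an RGB image in [0, 255]."""
    rgb = np.asarray(rgb, dtype=float) / 255.0
    h_bins, s_bins, v_bins = bins                 # assumed quantization levels
    hist = np.zeros(h_bins * s_bins * v_bins)
    for r, g, b in rgb.reshape(-1, 3):
        h, s, v = colorsys.rgb_to_hsv(r, g, b)    # each component in [0, 1]
        # Quantize each channel; min() keeps the value 1.0 in the last bin.
        hq = min(int(h * h_bins), h_bins - 1)
        sq = min(int(s * s_bins), s_bins - 1)
        vq = min(int(v * v_bins), v_bins - 1)
        hist[(hq * s_bins + sq) * v_bins + vq] += 1
    hist /= hist.sum()                            # relative frequencies
    return np.cumsum(hist)                        # cumulative histogram

# One red and one black pixel as a toy query image.
image = np.array([[[255, 0, 0], [0, 0, 0]]])
color = cumulative_hsv_histogram(image)

# Step 8: append a (placeholder) normalized 4-value texture feature.
texture = np.array([0.1, 0.2, 0.3, 0.4])
combined = np.concatenate([color, texture])
print(combined.shape)
```

With the assumed bins the color part has 72 entries, so the
combined GLCM vector in this sketch has 76.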

D. Feature Extraction Algorithm based on CCM

Step 1: Separate the R, G, B planes of the image.

Step 2: Convert the R, G, B color channels to the H, S, V
scale.

Step 3: Separate the R, G, H, I planes of the image.

Step 4: Repeat Steps 5-7 for each plane.

Step 5: Compute the GLCM matrices as given by Equation (2).

Step 6: Compute the probability values of the GLCM as given by
Equation (5).

Step 7: From the probability values of the GLCM, compute the
statistical features energy, entropy, contrast and inverse
difference as given by Equations (6)-(9).

Step 8: Construct the cumulative HSV color histogram of the
query image.

Step 9: Construct a combined feature vector for color and
texture.

Step 10: Find the distances between the feature vector of the
query image and the feature vectors of the target images using
the normalized Euclidean distance.

Step 11: Retrieve the 10 most similar images, those with
minimum distance.

A query image is separated into the R and G components of RGB
color space and the H and I components of HSV color space, and
a CCM of probability values is created for each component with
the chosen direction and pixel distance. For each CCM the
statistical features energy, entropy, contrast and inverse
difference are computed. The similarity between images is
measured from two types of features, the color features and the
texture features, which are combined and compared using the
Euclidean distance. The distance values are then sorted in
ascending order, and the ten best matches are displayed.
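
The per-plane texture extraction can be sketched as below for
the 90° direction. This is an illustrative sketch, not the
authors' code: the function names are hypothetical, the planes
are assumed to be already quantized to integer values in
`0..levels-1` (the text does not say how R, G, H and I are
quantized), and the natural logarithm is assumed for the
entropy.

```python
import numpy as np

def cooccurrence_features(plane, levels, d=1):
    """Energy, contrast, entropy, inverse difference of one plane at 90 degrees."""
    plane = np.asarray(plane, dtype=np.intp)
    a = plane[:-d, :].ravel()               # pixel at (x, y)
    b = plane[d:, :].ravel()                # pixel at (x + d, y), same column
    P = np.zeros((levels, levels))
    np.add.at(P, (a, b), 1)                 # accumulate pair counts
    P = P + P.T                             # count both orders, |x1 - x2| = d
    p = P / P.sum()                         # Equation (5)
    x, y = np.indices(p.shape)
    nz = p > 0                              # skip zeros in the entropy sum
    return np.array([
        np.sum(p ** 2),                     # energy, Equation (6)
        np.sum((x - y) ** 2 * p),           # contrast, Equation (7)
        -np.sum(p[nz] * np.log(p[nz])),     # entropy, Equation (8)
        np.sum(p / (1.0 + (x - y) ** 2)),   # inverse difference, Equation (9)
    ])

def ccm_feature_vector(planes, levels):
    """Concatenate the four statistics of each plane, in the order given."""
    return np.concatenate([cooccurrence_features(p, levels) for p in planes])

# Four toy planes standing in for R, G, H and I (random values, for shape only).
rng = np.random.default_rng(0)
planes = [rng.integers(0, 16, size=(32, 32)) for _ in range(4)]
print(ccm_feature_vector(planes, levels=16).shape)   # (16,)
```

Four planes with four statistics each give the 16-dimensional
texture feature described in the text.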

E. Distance Calculation 

The distance between two images is used to measure the
similarity between the query image and the images in the
database. Finding the distance between two feature vectors is
equivalent to measuring their similarity: the smaller the
distance, the more similar the images. The proposed method uses
the Euclidean distance between the two feature vectors.

Let $P = (p_1, p_2, \ldots, p_n)$ and $Q = (q_1, q_2, \ldots,
q_n)$ be two points in an n-dimensional space. The Euclidean
distance between the two vectors P and Q is defined as

$$d(P, Q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + \cdots + (p_n - q_n)^2} \tag{10}$$
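
As a small illustration of Equation (10) and of the ranking by
ascending distance, consider the following sketch. The function
names `euclidean_distance` and `rank_database` and the toy
vectors are assumptions made for this example only.

```python
import numpy as np

def euclidean_distance(p, q):
    """Equation (10): square root of the summed squared differences."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sqrt(np.sum((p - q) ** 2)))

def rank_database(query, database, top_k=10):
    """Return the indices of the top_k database vectors closest to the query."""
    distances = [euclidean_distance(query, v) for v in database]
    order = np.argsort(distances, kind="stable")    # ascending distance
    return [int(i) for i in order[:top_k]]

print(euclidean_distance([0, 0], [3, 4]))                        # 5.0
print(rank_database([0, 0], [[3, 4], [1, 0], [0, 2]], top_k=2))  # [1, 2]
```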

F. Method of Evaluation 

The feature vectors of all the images are calculated using
HSV, GLCM and CCM. The resulting feature vectors are stored in
the database for later comparison. In the proposed system, each
retrieved image is checked against the category of the query
image Q. The accuracy is calculated by Equation (11). Let
$N_{returned}$ be the number of images returned to the user
after a query has been made. Out of the $N_{returned}$ images,
$N_{correct}$ is the number of images that belong to the same
category as the query image Q. The precision P for a query
image Q is defined as

$$P = \frac{N_{correct}}{N_{returned}} \times 100 \tag{11}$$

The greater the value of P, the more accurate the system.
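
Equation (11) amounts to the following computation. The
function name `precision` and the category labels are
hypothetical and serve only to illustrate the formula.

```python
def precision(retrieved_labels, query_label):
    """Equation (11): percentage of returned images in the query's category."""
    n_returned = len(retrieved_labels)
    if n_returned == 0:
        raise ValueError("no images were returned")
    n_correct = sum(1 for label in retrieved_labels if label == query_label)
    return 100.0 * n_correct / n_returned

# 19 of 20 returned images share the query's category.
labels = ["horses"] * 19 + ["elephants"]
print(precision(labels, "horses"))   # 95.0
```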


III. Results and Discussion
A. Graphical User Interface 

MATLAB was used to develop the frontend GUI for the IR 
application. Figure 7 shows a screenshot taken from the 
application. 


[Figure 7 screenshot: a Database Panel with a file list (10.jpg,
100.jpg, 105.jpg) and "Browse", "select image" and "load
database" controls; an Information Panel listing Top 5 Matches:
5 (100%), Top 10 Matches: 10 (100%), Top 20 Matches: 19 (95%),
Top 50 Matches: 38 (76%), Top 100 Matches: 56 (56%); a Query
Method selector; and "Reset" and "Exit" buttons.]

Figure 7: GUI of the CBIR System for user control
The image retrieval application gives the user two options for
querying an image. The user can click “Browse” and select a
folder, whose images are then listed in the list box, or click
“Select Image” and then “load database”. The user can then
select the query method. The system performs the necessary
processing and displays the ten best-matching images. A green
box indicates a correctly returned image, while a red box
indicates a wrong image. To change the query image the user can
click “Reset”, or click “Exit” to close the application.

B. Performance Evaluation 

Table 1 shows the average precision over the top 10 retrieved
images. The overall precision is 82.92 percent for GLCM and
82.7 percent for CCM.


Table 1: Percentage of image retrieval, GLCM vs. CCM

Category             GLCM     CCM
Africans             85.3     86.33
Beaches              65       63
Monuments            74       76
Buses                93       95.33
Dinosaurs            98       98.33
Elephants            69.33    71.3
Flowers              97       95.33
Horses               94       93.33
Mountains            69.3     65
Food                 84.33    84
Average Precision    82.92    82.7

Table 2: Average precision (%) by GLCM for different numbers of
top retrieved images

                     Gray Level Co-occurrence Matrix
Category             TOP 5    TOP 10   TOP 20   TOP 50   TOP 100
African              93.3     85.3     80       65       52.1
Beaches              72       65       55.16    42.12    34.2
Monuments            82.6     74       62.1     44.66    33.06
Buses                92.66    93       81.5     76.33    59.8
Dinosaurs            98       98       95.8     85.06    63.26
Elephants            76.6     69.33    53.16    37.33    27.13
Flowers              98       97       93.1     77.6     53.46
Horses               94.6     94       91.5     80.8     77
Mountain             79.3     69.3     71.6     44.6     35.4
Food                 86.66    84.33    74.5     64.33    51.26
Average Precision    96.57    82.92    85.42    61.78    48.6

Table 3: Average precision (%) by CCM for different numbers of
top retrieved images

                     Co-occurrence Matrix
Category             TOP 5    TOP 10   TOP 20   TOP 50   TOP 100
African              88       86.33    81.16    66.46    52.6
Beaches              72       63       53.83    43.73    36.5
Monuments            82       76       61.6     44.53    34.43
Buses                91.33    95.33    87       75.93    60.53
Dinosaurs            98       98.33    98.16    94.86    75.8
Elephants            84       71.3     56.33    95.6     46.3
Flowers              97.3     95.33    85.5     95.6     46.3
Horses               93.33    93.33    95.5     87.73    76.93
Mountain             72.66    65       56.66    45.2     39.6
Food                 91.33    84       82.66    63.55    48.06
Average Precision    86.9     82.7     66.84    65.7     49.87

It can be seen from Table 1 that GLCM gives a slightly better
average precision than CCM for the top 10 retrievals (82.92
versus 82.7 percent).

IV. Conclusion 

The proposed method provides an approach to image retrieval
based on the HSV color space and the texture characteristics of
the image. Similarity is measured from two types of features,
color and texture. After quantizing the HSV color space, we
combine the color features separately with the gray level
co-occurrence matrix and with the color co-occurrence matrix,
using a normalized Euclidean distance classifier. The image
retrieval experiments indicate that a retrieval method using
both color and texture features is superior to one using color
alone, and that combining color and texture features into an
integrated feature has clear advantages for color image
retrieval. In future work, we expect to use color and shape for
image retrieval, and we need to find ways to reduce the
computational cost without reducing the accuracy.